1. INTRODUCTION

Information retrieval (IR) is the activity of obtaining information resources relevant to an information
need from a collection of information resources [1]. Automated information retrieval systems are used to
reduce what has been called "information overload". Web search engines are the most visible IR applications.
Queries are formal statements of information needs, for example search strings in web search engines.

A Multi Agent System (MAS) can be considered a pool of information agents. An information agent is an
agent that has access to one or more information sources, and is able to store and process information 
obtained from these sources in order to answer queries posed by users and other information agents. The 
information sources may be of many types, including web services, web sites, RSS-feeds, and traditional 
databases [2]. 

The Fuzzy Metagraph is an emerging technique used in the design of many information processing
systems, such as transaction processing systems, decision support systems, and workflow systems [3].

Zheng-Hua Tan has proposed a Fuzzy Metagraph (FM) based knowledge representation. The FM has been applied
to fuzzy rule-based systems for knowledge representation and reasoning in the form of an algebraic
representation and an FM closure matrix [4]. A. Thirunavukarasu and Dr. S. Umamaheswari have proposed a
Fuzzy Metagraph based knowledge representation for Decision Support Systems (DSS). This method can be
used in many real-world applications such as e-commerce, the share market and disease analysis [5].

To deal with the vagueness typical of human knowledge, fuzzy set theory [6] can be used to
manipulate the knowledge in the knowledge base. Knowledge bases in information retrieval cover a wide range
of topics, of which query expansion is one. The main aim of query expansion is to add new meaningful terms
to the initial query.

In this work we focus, in the first stage, on analyzing Automatic Information Retrieval Multi
agent Modeling based on the Fuzzy Metagraph, so that the user can understand the system model. In the second
stage, the relevance of the documents returned by the model is evaluated (ranking score) using cosine
similarity in the vector space model between queries and documents.

The rest of the paper is organized as follows. Section 2 explains the modeling of the multi-agent system.
Section 3 presents the underlying technique and describes the fuzzy metagraph information retrieval
multi-agent modeling technique. Section 4 explains the proposed model. Section 5 presents the experimental
results and Section 6 concludes the paper.


2. MULTI AGENT SYSTEM MODELING 

The purpose of the multi-agent system is to aid users in searching and retrieving information
available on the World Wide Web. A system devoted to performing automatic information retrieval might
encompass four main steps: (i) search the World Wide Web with keywords, (ii) extract the required
information from web sources, (iii) mine the texts extracted from the web, and (iv) store the output in a
database [7]. This model consists of three agents. The first agent searches the Internet by keywords (query
words) using the Google search engine, collects the URLs of the available websites, and stores these URLs
in the database. The second agent automatically retrieves the documents from the URLs. The third agent
(1) extracts useful information from the retrieved documents, (2) preprocesses the text using tokenization
(removing all punctuation and special characters and replacing tabs and other non-text characters by a
single space), stop-word removal (removing words that are not related to the documents) and stemming (a
heuristic process in which the ends of words or the affixes of derived words are chopped off to obtain the
base form of the word) [1, 7], and (3) computes the term weight (tf-idf) as described in Section 2.1.
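
The preprocessing and weighting performed by the third agent can be sketched as follows. This is only an illustrative sketch, not the implementation used in the paper: the stop-word list, the suffix-stripping `naive_stem` function (a crude stand-in for a real stemmer such as Porter's) and the weighting formula tf x log(N / df) are all assumptions, since the paper does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to", "for"}


def tokenize(text):
    """Lowercase the text and replace punctuation and non-text characters by spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Crude suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


def tf_idf(documents):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    token_lists = [preprocess(d) for d in documents]
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    weights = []
    for tokens in token_lists:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights
```

With this weighting, a term that occurs in every document receives weight 0, which is one reason practical systems often use a smoothed variant of the idf factor.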

The agents are JADE agents capable of (i) interacting by exchanging FIPA-ACL messages, (ii) sharing a
common ontology in accordance with the actual application, and (iii) exhibiting a specific behavior according
to their role [7].


2.1. Vector Space Model 

The vector space model (VSM) is based on the interpretation of both documents and queries as points in a
multidimensional document space [1, 7]. The cosine measure can be interpreted as the angle between the
query vector and the document vector in m-dimensional document space. The similarity of a document
vector to a query vector equals the cosine of the angle between them [1, 8] and is given by equation (1):

$$\mathrm{sim}(q, d_i) = \cos\theta = \frac{\sum_{j=1}^{m} w_{ij}\, w_{qj}}{\sqrt{\sum_{j=1}^{m} w_{ij}^{2}}\;\sqrt{\sum_{j=1}^{m} w_{qj}^{2}}} \qquad (1)$$

where q is the query vector, d_i is the i-th document vector in the collection, w_qj is the tf-idf weight of
term j in the query q, and w_ij is the tf-idf weight (term frequency-inverse document frequency) [1] of
term j in the document d_i.

The two vectors d_i (document vector) and q (query vector) are given by equation (2):

$$d_i = (w_{i1}, w_{i2}, \ldots, w_{im}), \qquad q = (w_{q1}, w_{q2}, \ldots, w_{qm}) \qquad (2)$$

If all the vectors are normalized, then the cosine of the angle between two vectors is the same as their
dot product. If d_i is the document vector and q is the query vector, then the similarity of document d_i
to query q (or the score of d_i for q) in equation (1) can be represented as:

$$\mathrm{sim}(q, d_i) = \cos\theta = \sum_{j=1}^{m} w_{ij} \times w_{qj} \qquad (3)$$
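
The cosine measure and its normalized dot-product form can be illustrated with a short sketch. The sparse dictionary representation of the vectors and the function names `cosine_similarity`, `normalize` and `rank` are assumptions made for this example only, as is the convention of returning 0 for an empty vector.

```python
import math


def cosine_similarity(query, document):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * document.get(term, 0.0) for term, w in query.items())
    norm_q = math.sqrt(sum(w * w for w in query.values()))
    norm_d = math.sqrt(sum(w * w for w in document.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # convention assumed here for empty vectors
    return dot / (norm_q * norm_d)


def normalize(vector):
    """Scale a vector to unit length so that cosine reduces to a dot product."""
    norm = math.sqrt(sum(w * w for w in vector.values()))
    return {t: w / norm for t, w in vector.items()} if norm else dict(vector)


def rank(query, documents):
    """Return (document index, score) pairs sorted by decreasing score."""
    scores = [(i, cosine_similarity(query, d)) for i, d in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

For unit-length vectors produced by `normalize`, summing the products of matching weights gives the same value as `cosine_similarity`, up to floating-point rounding.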


3. RESULTS AND ANALYSIS

The Fuzzy Metagraph for Multi-agent Information Retrieval is shown in Figure 1.

The three agents shown in Figure 1 connect with each other to build the fuzzy multi-agent information
retrieval model. The user writes the input keywords at the user interface stage, and the three agents are
used to retrieve documents from Google. In this section the user interface and the multi-agent system are
represented by the fuzzy metagraph, where every agent is represented by a set. The user interface (UI), by
which the user enters the query (keywords), is represented by the set {X1}. Agent 1 is represented by the
set {X2, X3, X4}. Agent 2 is represented by the set {X5}. Agent 3 is represented by the set
{X6, X7, X8, X9}, and the output, the retrieved documents indexed in a matrix containing the terms of each
document, is represented by {X10}.

The generating set X of the fuzzy metagraph in Figure 1 is:

X = {X1, X2, X3, X4, X5, X6, X7, X8, X9, X10}. The meanings of the sets and variables used here are
explained in Table 1.


Figure 1. Fuzzy Metagraph for Each Agent Process of Multi-Agent Information Retrieval Modeling


Table 1. Meaning of sets in Figure 1

Set   Variable   Meaning
X1    UI         User interface to write keywords
X2    GS         Google search URL
X3    CURL       Google calculates URL
X4    SURL       Store URL
X5    RD         Retrieve document from URL
X6    DT         Document tokenization
X7    DF         Document filtering
X8    DS         Document stemming
X9    TW         Term weight calculation
X10   OE         Output evaluation


The edge set can be specified as:

E = {e1, e2, e3, e4}, where
e1 = <{X1}, {X2, X3, X4}>,
e2 = <{X2, X3, X4}, {X5}>,
e3 = <{X5}, {X6, X7, X8, X9}>,
e4 = <{X6, X7, X8, X9}, {X10}>.

As an example, the in-vertex and out-vertex of e4 are: in-vertex = {X6, X7, X8, X9}, out-vertex = {X10}.


The simple path of the fuzzy metagraph is represented as follows. An important property of graphs is that
of connectivity. In Figure 1 there is a sequence of edges (e1, e2, e3, e4) that connects X1 to X10, which
means that a path from X1 to X10 exists.
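
The connectivity argument can be checked mechanically with a small sketch. The edge dictionary below is our reading of Figure 1 and Table 1 and should be treated as an assumption, as should the function name `find_simple_path`. The sketch uses the usual metagraph notion of a simple path: a sequence of edges in which the source belongs to the in-vertex of the first edge, the target belongs to the out-vertex of the last edge, and the out-vertex of each edge shares at least one element with the in-vertex of the next.

```python
# Each edge maps to (in-vertex, out-vertex); the element sets are assumed
# from Figure 1 and Table 1.
EDGES = {
    "e1": ({"X1"}, {"X2", "X3", "X4"}),
    "e2": ({"X2", "X3", "X4"}, {"X5"}),
    "e3": ({"X5"}, {"X6", "X7", "X8", "X9"}),
    "e4": ({"X6", "X7", "X8", "X9"}, {"X10"}),
}


def find_simple_path(edges, source, target):
    """Depth-first search for a simple path of edges from source to target."""

    def extend(path, used):
        last_out = edges[path[-1]][1]
        if target in last_out:
            return path
        for name, (in_vertex, _) in edges.items():
            # Consecutive edges must overlap: out-vertex meets in-vertex.
            if name not in used and last_out & in_vertex:
                found = extend(path + [name], used | {name})
                if found:
                    return found
        return None

    for name, (in_vertex, _) in edges.items():
        if source in in_vertex:
            found = extend([name], {name})
            if found:
                return found
    return None
```

Calling `find_simple_path(EDGES, "X1", "X10")` returns the edge sequence e1, e2, e3, e4, while the reverse query from X10 to X1 returns `None` because no edge has X10 in its in-vertex.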

Fuzzy rules are formed from the fuzzy metagraph [9]. In this paper, according to Figure 1, the following
sets are used for the fuzzy inference system (FIS), as described in Section 5.

X1: weight of a term in the query (tf-idf). Before applying the membership function, w(q,t) ∈ (0,1);
after applying the membership function μ, X1 ∈ (x1, μ).

X9: weight of a term in the document (tf-idf). Before applying the membership function, w(t,d) ∈ (0,1);
after applying the membership function μ, X9 ∈ (x9, μ).

The other sets are not used because they have qualitative rather than quantitative values.


4. PROPOSED MODEL OF FUZZY METAGRAPH MULTIAGENT INFORMATION 
RETRIEVAL DECISION MAKING 
The structure of the proposed model is shown in Figure 2. In the first stage, the multi-agent system is
represented by a fuzzy metagraph, for simplicity and to make clear how the multi-agent system retrieves
information from Google as a search engine using a keyword query such as (computer science) (see Figure 2
and Section 3). In the second stage, the documents output by the multi-agent system pass through document
tokenization, filtering and stemming in agent three (see Table 1 and Section 3), and the term weights
w(q,t) and w(t,d) are calculated. There are two inputs and one output to the fuzzy inference
system [5], and two rules are applied in the fuzzy inference process. In the third stage, the input
weights, rules and membership functions are passed to fuzzification and aggregation. The user can then take
what is needed from the retrieved documents, and the output is the ranking score of the documents.


Input (keywords)
-> Multi-agent represented by fuzzy metagraph, which contains the user interface and the multi-agent
-> Output of the multi-agent after calculating the important values for scoring the documents
-> Fuzzy Inference System
-> Output document ranking score

Figure 2. Block Diagram of the Proposed Model


5. EXPERIMENTAL EVALUATION EXAMPLE 
The experimental evaluation was carried out on the MATLAB platform using a Mamdani-type FIS, a Sugeno-type
FIS, and a sample of the documents retrieved by the multi-agent system. The experiment was run to evaluate
the ranking score of relevant documents using equation (3) and the keyword query (computer science).
The input rules to the Mamdani-type FIS and the Sugeno-type FIS are the following:
1. If (w(q, t1) is high) and (w(t1, d) is high) then (cosine(q,d) score is high)
2. If (w(q, t1) is low) and (w(t1, d) is low) then (cosine(q,d) score is low)
3. If (w(t2, d) is high) and (w(q, t2) is high) then (cosine(q,d) score is high)
4. If (w(t2, d) is low) and (w(q, t2) is low) then (cosine(q,d) score is low)


Here w(q, t) is the tf-idf weight of the term in the query (w_qj) and w(t, d) is the tf-idf weight of
term j in document i (w_ij), as in equations (1) and (3). Triangular membership functions were
used for the linguistic input terms w(q, t1), w(q, t2), w(t1, d) and w(t2, d), as shown in the example
in Figure 3.


Figure 3. Example Membership Function of Input w(q, t2)


5.1. Evaluation by Using Mamdani-Type FIS

The proposed FIS for evaluating the document ranking score has four inputs (two for the term weights in
the document and two for the term weights in the query), w(t1,d), w(q,t1), w(t2,d), w(q,t2), as shown in
Figure 4. The system has one output that indicates the score of the document. Each of the selected input
and output variables is described by a set of two linguistic fuzzy values (low and high) defined by
triangular membership functions, which allows the fuzzification procedure to convert the measured numerical
value into one of the fuzzy values. Figure 3 shows the triangular membership function of one of the inputs,
w(q,t2), and Figure 5 shows the triangular membership functions of the output score (cosine similarity). In
the experiment the input was processed by the Mamdani-type fuzzy inference system using the triangular
membership functions and rules described previously, as shown in Figure 4. The input to the defuzzification
process was the aggregate output fuzzy set (the sum after applying the rules), and the output was a single
number (the centroid value), as shown in Figure 5.a and Figure 5.b. The document in Figure 5.b had the
highest score (centroid value 0.666), while the document in Figure 5.a had the lowest score (centroid value
0.573). The plots obtained after simulating the Mamdani-type FIS for the document cosine similarity score
are shown in Figure 5.c and Figure 5.d.
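
The Mamdani-type evaluation can be approximated outside MATLAB with the following sketch. It is not the configuration used in the experiment: the triangle parameters `LOW` and `HIGH`, the use of min for AND and for implication, max for aggregation, and the discretized centroid are all assumptions chosen for illustration, so the sketch is not expected to reproduce the centroid values reported above.

```python
def trimf(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed triangle parameters on the [0, 1] universe (not from the paper).
LOW = (-1.0, 0.0, 1.0)
HIGH = (0.0, 1.0, 2.0)


def low(x):
    return trimf(x, *LOW)


def high(x):
    return trimf(x, *HIGH)


def mamdani_score(wq1, wd1, wq2, wd2, steps=1000):
    """Four-rule Mamdani inference with centroid defuzzification."""
    # Firing strengths (AND = min); rules with the same consequent are
    # combined with max.
    fire_high = max(min(high(wq1), high(wd1)), min(high(wd2), high(wq2)))
    fire_low = max(min(low(wq1), low(wd1)), min(low(wd2), low(wq2)))
    num = den = 0.0
    for k in range(steps + 1):
        y = k / steps
        # Clip each output set by its firing strength, then aggregate by max.
        mu = max(min(fire_high, high(y)), min(fire_low, low(y)))
        num += y * mu
        den += mu
    return num / den if den else 0.5  # 0.5 is an arbitrary fallback
```

Because the centroid is taken over a clipped area, the crisp score from this sketch stays strictly inside the interval (0, 1) even for extreme inputs.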


Figure 5. Experimental Results Using Mamdani-Type FIS


5.2. Evaluation by Using Sugeno-Type FIS

The initial steps and settings of the Sugeno-type FIS are the same as those of the Mamdani-type FIS. It
also has four inputs (two for the term weights in the document and two for the term weights in the query),
w(t1,d), w(q,t1), w(t2,d), w(q,t2), as shown in Figure 6, and produces one output that indicates the
similarity or ranking score (cosine). Each of the selected input variables is described by a set of two
linguistic fuzzy values defined by triangular membership functions, as in the case of the Mamdani-type
fuzzy inference system (as already shown in Figure 3). Unlike the output value range of the Mamdani-type
fuzzy inference system, the range of the Sugeno-type output is between 0 and 1. The output membership
functions of this FIS can only be constant or linear, so the two linguistic fuzzy values for the output,
“Low” and “High”, are set to the constants 0 (low score) and 1 (high score). The rule base for the
Sugeno-type FIS is the same as for the Mamdani-type FIS. In the experiment the input was processed by the
Sugeno-type fuzzy inference system using the triangular membership functions and rules described above. The
output was a single number (the weighted average), as shown in Figure 7.a and Figure 7.b. The document in
Figure 7.b had the highest score (weighted average 0.999), while the document in Figure 7.a had the lowest
score (weighted average 0.714). The plots obtained after simulating the Sugeno-type FIS for the document
cosine similarity score are shown in Figure 7.c and Figure 7.d.
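
A zero-order Sugeno-type evaluation with the same four rules can be sketched as follows. As before, the triangle parameters `LOW` and `HIGH` and the use of min for AND are assumptions, not the settings of the experiment; only the constant consequents 0 (low) and 1 (high) and the weighted-average output follow the description above.

```python
def trimf(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed triangle parameters on the [0, 1] universe (not from the paper).
LOW = (-1.0, 0.0, 1.0)
HIGH = (0.0, 1.0, 2.0)


def low(x):
    return trimf(x, *LOW)


def high(x):
    return trimf(x, *HIGH)


def sugeno_score(wq1, wd1, wq2, wd2):
    """Weighted average of constant rule outputs (0 for low, 1 for high)."""
    rules = [
        (min(high(wq1), high(wd1)), 1.0),  # rule 1: both high -> high
        (min(low(wq1), low(wd1)), 0.0),    # rule 2: both low  -> low
        (min(high(wd2), high(wq2)), 1.0),  # rule 3: both high -> high
        (min(low(wd2), low(wq2)), 0.0),    # rule 4: both low  -> low
    ]
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.5  # arbitrary fallback when no rule fires
    return sum(strength * z for strength, z in rules) / total
```

Unlike a clipped-centroid output, the weighted average can reach the end points 0 and 1, which is consistent with the wider score range reported for the Sugeno-type FIS.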


Figure 6. Sugeno-Type Fuzzy Inference System Using Membership Functions and Rules


Figure 7. Experimental Results Using Sugeno-Type FIS


6. CONCLUSION 

A multi-agent system has been presented that automatically retrieves multiple text documents and extracts
the useful information from them according to the user's interests in a web-based environment by using
keywords. Fuzzy Metagraph modeling of the Automatic Information Retrieval Multi agent system has been
analyzed. We presented the traditional method of measuring the cosine similarity between query and
document to evaluate the document ranking score. The Fuzzy Metagraph multi-agent model was combined with a
fuzzy inference system to construct a model for the document ranking score. Computing the cosine similarity
ranking score with a fuzzy inference system is much simpler to develop and implement than the traditional
method, which requires mathematical equations. It has been concluded from this paper that, for evaluating
the document ranking score using the cosine similarity method, Mamdani-type FIS and Sugeno-type FIS work
similarly. The membership functions and rules are the same for both FIS; the only differences are that the
output membership functions for Sugeno-type FIS can only be constant or linear, and that the crisp output
is generated in different ways. Sugeno-type FIS gives better results than Mamdani-type FIS. Both models are
simulated using four rules and four input membership functions. Only one output value is used for the
document ranking score: the centroid value in the case of Mamdani-type FIS and the weighted average value
in the case of Sugeno-type FIS.


Multi-agent System for Documents Retrieval and Evaluation Using... (Galina Ivanova)
ISSN: 2252-8938